Integrity Verification for Outsourcing Uncertain Frequent Itemset Mining

نویسندگان

  • Qiwei Lu
  • Wenchao Huang
  • Yan Xiong
  • Xudong Gong
چکیده

In recent years, due to the wide applications of uncertain data (e.g., noisy data), uncertain frequent itemsets (UFI) mining over uncertain databases has attracted much attention, which differs from the corresponding deterministic problem from the generalized definition and resolutions. As the most costly task in association rule mining process, it has been shown that outsourcing this task to a service provider (e.g.,the third cloud party) brings several benefits to the data owner such as cost relief and a less commitment to storage and computational resources. However, the correctness integrity of mining results can be corrupted if the service provider is with random fault or not honest (e.g., lazy, malicious, etc). Therefore, in this paper, we focus on the integrity and verification issue in UFI mining problem during outsourcing process, i.e., how the data owner verifies the mining results. Specifically, we explore and extend the existing work on deterministic FI outsourcing verification to uncertain scenario. For this purpose, We extend the existing outsourcing FI mining work to uncertain area w.r.t. the two popular UFI definition criteria and the approximate UFI mining methods. Specifically, We construct and improve the basic/enhanced verification scheme with such different UFI definition respectively. After that, we further discuss the scenario of existing approximation UFP mining, where we can see that our technique can provide good probabilistic guarantees about the correctness of the verification. Finally, we present the comparisons and analysis on the schemes proposed in this paper.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Result Integrity Verification of Outsourced Frequent Itemset Mining

The data-mining-as-a-service (DMaS) paradigm enables the data owner (client) that lacks expertise or computational resources to outsource its mining tasks to a third-party service provider (server). Outsourcing, however, raises a serious security issue: how can the client of weak computational power verify that the server returned correct mining result? In this paper, we focus on the problem of...

متن کامل

An Audit Environment for Outsourcing of Frequent Itemset Mining

Finding frequent itemsets is the most costly task in association rule mining. Outsourcing this task to a service provider brings several benefits to the data owner such as cost relief and a less commitment to storage and computational resources. Mining results, however, can be corrupted if the service provider (i) is honest but makes mistakes in the mining process, or (ii) is lazy and reduces c...

متن کامل

Analysis of Frequent Item set Mining on Variant Datasets

Association rule mining is the process of discovering relationships among the data items in large database. It is one of the most important problems in the field of data mining. Finding frequent itemsets is one of the most computationally expensive tasks in association rule mining. The classical frequent itemset mining approaches mine the frequent itemsets from the database where presence of an...

متن کامل

Mining Frequent Itemsets over Uncertain Databases

In recent years, due to the wide applications of uncertain data, mining frequent itemsets over uncertain databases has attracted much attention. In uncertain databases, the support of an itemset is a random variable instead of a fixed occurrence counting of this itemset. Thus, unlike the corresponding problem in deterministic databases where the frequent itemset has a unique definition, the fre...

متن کامل

Accelerating Parallel Frequent Itemset Mining on Graphics Processors with Sorting

Frequent Itemset Mining (FIM) is one of the most investigated fields of data mining. The goal of Frequent Itemset Mining (FIM) is to find the most frequently-occurring subsets from the transactions within a database. Many methods have been proposed to solve this problem, and the Apriori algorithm is one of the best known methods for frequent Itemset mining (FIM) in a transactional database. In ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:
  • CoRR

دوره abs/1307.2991  شماره 

صفحات  -

تاریخ انتشار 2013